Efficient community identification and maintenance at multiple resolutions on distributed datastores

Abstract

Identifying network communities at multiple resolutions is of great practical interest as a way to find highly cohesive subnetworks around different subjects in a network. For instance, one might examine the interconnections among web pages, blogs and social content to identify pockets of influencers on subjects like 'Big Data', 'smart phone' or 'global warming'. Because a network's graph representation and content change dynamically, the incremental maintenance of a community poses significant computational challenges. Moreover, the intensity of community engagement can be distinguished at multiple levels, resulting in a multi-resolution community representation that has to be maintained over time. In this paper, we first formalize this problem using the k-core metric projected at multiple k-values, so that multiple community resolutions are represented by multiple k-core graphs. Recognizing that large graphs and their even larger attributed content cannot be stored and managed by a single server, we then propose distributed algorithms to construct and maintain a multi-k-core graph, implemented on the scalable Big Data platform Apache HBase. Our experimental evaluation shows that maintaining the multi-k-core graph incrementally is orders of magnitude faster than complete reconstruction. Our algorithms thus enable practitioners to create and maintain communities at multiple resolutions on multiple subjects in rich network content simultaneously. © 2015 Elsevier B.V. All rights reserved.
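
To make the k-core idea in the abstract concrete, the following is a minimal single-machine sketch in Python. It is not the paper's distributed HBase algorithm and does not cover incremental maintenance; it only illustrates how one graph can be projected at several k-values. The function names `core_numbers` and `multi_k_core` and the toy edge list are assumptions introduced here for illustration. The k-core of a graph is the largest subgraph in which every vertex has at least k neighbours inside that subgraph, and a vertex's core number is the largest k for which it belongs to the k-core.

```python
from collections import defaultdict


def core_numbers(edges):
    """Return {vertex: core number} for an undirected simple graph.

    Uses the classic peeling idea: repeatedly remove a vertex of minimum
    remaining degree. This simple version is quadratic in the number of
    vertices; it favours clarity over speed.
    """
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops
            adj[u].add(v)
            adj[v].add(u)

    degree = {v: len(neighbours) for v, neighbours in adj.items()}
    remaining = set(adj)
    core = {}
    k = 0
    while remaining:
        # Peel the vertex with the smallest remaining degree.
        v = min(remaining, key=lambda x: degree[x])
        k = max(k, degree[v])  # core numbers never decrease along the peel
        core[v] = k
        remaining.remove(v)
        for w in adj[v]:
            if w in remaining:
                degree[w] -= 1
    return core


def multi_k_core(edges, k_values):
    """Return {k: set of vertices in the k-core} for each requested k."""
    core = core_numbers(edges)
    return {k: {v for v, c in core.items() if c >= k} for k in k_values}


# Hypothetical toy graph: a 4-clique (a, b, c, d), a vertex e attached
# to a and b, and a pendant vertex f attached to e.
edges = [
    ("a", "b"), ("a", "c"), ("a", "d"),
    ("b", "c"), ("b", "d"), ("c", "d"),
    ("e", "a"), ("e", "b"), ("f", "e"),
]
resolutions = multi_k_core(edges, [1, 2, 3])
for k in sorted(resolutions):
    print(k, sorted(resolutions[k]))
```

In this toy graph the 1-core contains all six vertices, the 2-core drops the pendant vertex f, and the 3-core keeps only the clique, so each higher k-value gives a tighter "resolution" of the same community. Because every k-core is nested inside the (k-1)-core, a single core-number table is enough to answer all resolutions at once in this sketch.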
